model-free system
Sample-efficient AI
Since AlphaGo, AI researchers have recognized the promise of integrating reinforcement learning with search methods, which consider many of the next actions available to an RL agent and simulate their likely results before choosing one. This mimics human deliberation much more closely, by explicitly introducing elements of "planning" into the RL paradigm. Yang attributes the huge performance improvements of AlphaGo, AlphaZero and MuZero to this search process. Another important distinction in RL is between model-based systems, which construct explicit models of their environments, and model-free systems, which don't. Prior to AlphaGo, just about all leading RL work was done on model-free systems (PPO and deep Q-learning, for example). Model-based systems just weren't practical, because learning an environment model is hard and adds a significant layer of complexity on top of the simpler action selection task that model-free systems could focus on exclusively.
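To make the two distinctions above concrete, here is a minimal sketch on a toy problem. Everything in it is an illustrative assumption rather than anything taken from AlphaGo, AlphaZero or MuZero: the five-state "walk to the goal" environment, the names `step`, `q_learning` and `plan_with_model`, and all hyperparameters are hypothetical. The first function is model-free tabular Q-learning, which only ever sees sampled transitions. The second is a model-based lookahead search that simulates each action with a model before choosing. For simplicity the planner is handed the true dynamics as its model, whereas a system like MuZero has to learn that model.

```python
import random

# Hypothetical toy environment: states 0..4 on a line, goal at state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # step left or step right
GOAL = N_STATES - 1


def step(state, action):
    """Environment dynamics: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Model-free: learn action values from sampled transitions only."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                action = max(ACTIONS, key=lambda a: (q[(state, a)], rng.random()))
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            td_target = reward + gamma * best_next
            q[(state, action)] += alpha * (td_target - q[(state, action)])
            state = next_state
    return q


def plan_with_model(state, model, depth=4, gamma=0.9):
    """Model-based: simulate every action sequence up to `depth` steps
    using `model`, and return (best_return, best_first_action)."""
    if depth == 0:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for action in ACTIONS:
        next_state, reward, done = model(state, action)
        future = 0.0 if done else plan_with_model(next_state, model, depth - 1, gamma)[0]
        value = reward + gamma * future
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


if __name__ == "__main__":
    q = q_learning()
    print("learned greedy actions:",
          [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)])
    print("planned (value, action) from state 0:", plan_with_model(0, step))
```

The Q-learner needs many episodes of trial and error before its value table points toward the goal, while the planner finds the same first move immediately by simulating ahead. The catch is that the planner is only as good as the model it is given, which is the practical difficulty the paragraph above describes.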
AlphaGo: Did DeepMind Just Solve Intelligence?!
Just recently, DeepMind's AlphaGo won a series of Go matches against a top-level human opponent. This victory has caused a mix of excitement and consternation. Are we seeing another case of a bigger and faster machine pushing the edge of performance, or are we perhaps approaching a fundamental crisis of "cognitive competition"? To answer this question, we look at the succession of game-playing computers, and then explore the rise of "model-free methods" and what that rise foretells for our future. We have become used to the idea that purpose-built machines can surpass humans in almost any physical task.